[test] Add a benchmark for SFG edges' two-level map #1927

Merged
merged 5 commits into from
Oct 10, 2020

Conversation

k-ye
Member

@k-ye k-ye commented Oct 7, 2020

This adds a benchmark for five kinds of map implementations:

using UnorderedMapSet = std::unordered_map<int, std::unordered_set<int>>;
using LLVMVecSet = llvm::SmallVector<std::pair<int, llvm::SmallSet<int, 4>>, 4>;
using LLVMVecVec =
    llvm::SmallVector<std::pair<int, llvm::SmallVector<int, 4>>, 4>;
using StlVecSet = std::vector<std::pair<int, std::unordered_set<int>>>;
using StlVecVec = std::vector<std::pair<int, std::vector<int>>>;

The test data might not be very representative of the actual usage pattern, but it seems to indicate that llvm::SmallVector<std::pair<Key, llvm::SmallSet<Value>>> is a pretty decent data structure. (I'm too lazy to implement the sorted Vector<std::pair<Key, Value>>. That data structure would require us to keep it sorted on every single insert; otherwise the dedupe process would not be very efficient...)
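
For readers who want to see what the "vector of pairs" layout amounts to, here is a minimal sketch using only the standard library. It mirrors the shape of StlVecVec above; the function names `insert_edge` and `has_edge` are hypothetical and are not taken from the benchmark itself, which may be written differently.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Same shape as StlVecVec: key -> list of unique values, searched linearly.
using VecVec = std::vector<std::pair<int, std::vector<int>>>;

// Hypothetical helper: inserts (key, value) and returns true if it was new.
bool insert_edge(VecVec &m, int key, int value) {
  auto it = std::find_if(m.begin(), m.end(),
                         [key](const auto &kv) { return kv.first == key; });
  if (it == m.end()) {
    m.emplace_back(key, std::vector<int>{value});
    return true;
  }
  auto &vals = it->second;
  if (std::find(vals.begin(), vals.end(), value) != vals.end()) {
    return false;  // duplicate: dedupe by linear scan
  }
  vals.push_back(value);
  return true;
}

// Hypothetical helper: true if (key, value) is stored.
bool has_edge(const VecVec &m, int key, int value) {
  for (const auto &kv : m) {
    if (kv.first == key) {
      return std::find(kv.second.begin(), kv.second.end(), value) !=
             kv.second.end();
    }
  }
  return false;
}

void example_usage() {
  VecVec m;
  assert(insert_edge(m, 1, 10));   // new key
  assert(!insert_edge(m, 1, 10));  // duplicate is rejected
  assert(insert_edge(m, 1, 11));   // new value under an existing key
  assert(has_edge(m, 1, 11));
  assert(!has_edge(m, 2, 10));     // key not present
}
```

Both levels are scanned linearly, so this only makes sense when the number of keys and the number of values per key stay small.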

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[Profiler thread 0x119f55dc0]
    277.281 us UnorderedMapSet insert        [250 x   1.109 us]
    170.469 us UnorderedMapSet lookup found  [250 x 681.877 ns]
     74.625 us UnorderedMapSet lookup not found [100 x 746.250 ns]
    154.495 us LLVMVecSet insert             [250 x 617.981 ns]
    196.218 us LLVMVecSet lookup found       [250 x 784.874 ns]
     75.340 us LLVMVecSet lookup not found   [100 x 753.403 ns]
    168.562 us LLVMVecVec insert             [250 x 674.248 ns]
    222.445 us LLVMVecVec lookup found       [250 x 889.778 ns]
     92.030 us LLVMVecVec lookup not found   [100 x 920.296 ns]
    367.641 us StlVecSet insert              [250 x   1.471 us]
    216.961 us StlVecSet lookup found        [250 x 867.844 ns]
    110.626 us StlVecSet lookup not found    [100 x   1.106 us]
    319.004 us StlVecVec insert              [250 x   1.276 us]
    234.365 us StlVecVec lookup found        [250 x 937.462 ns]
    121.355 us StlVecVec lookup not found    [100 x   1.214 us]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Related issue = #


@yuanming-hu
Member

Wow this is interesting to see! It's surprising that LLVMSmallX still takes 600+ ns on these operations - I thought it should be something < 50 ns :-)

Contributor

@xumingkuan xumingkuan left a comment

Cool! I'm also surprised to see that there's nothing < 50 ns. Here's my two cents:

  • What's the overhead of TI_PROFILER -- what about an empty lookup/insert function with the same parameters?
  • What about llvm::SmallVector/SmallSet<std::pair<int, int>, 16>? I wonder if the indirection caused by nested data structures is the bottleneck.

@k-ye
Member Author

k-ye commented Oct 9, 2020

OK, I added Empty, FlattenSet and FlattenVec. Here are the numbers:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[Profiler thread 0x1113c0dc0]
     72.956 us Empty insert                  [250 x 291.824 ns]
     48.161 us Empty lookup found            [250 x 192.642 ns]
     26.226 us Empty lookup not found        [100 x 262.260 ns]

    223.398 us UnorderedMapSet insert        [250 x 893.593 ns]
    164.032 us UnorderedMapSet lookup found  [250 x 656.128 ns]
     68.426 us UnorderedMapSet lookup not found [100 x 684.261 ns]

    144.243 us LLVMVecSet insert             [250 x 576.973 ns]
    204.086 us LLVMVecSet lookup found       [250 x 816.345 ns]
     74.863 us LLVMVecSet lookup not found   [100 x 748.634 ns]

    160.694 us LLVMVecVec insert             [250 x 642.776 ns]
    208.855 us LLVMVecVec lookup found       [250 x 835.419 ns]
     90.599 us LLVMVecVec lookup not found   [100 x 905.991 ns]

    302.076 us StlVecSet insert              [250 x   1.208 us]
    201.941 us StlVecSet lookup found        [250 x 807.762 ns]
     96.798 us StlVecSet lookup not found    [100 x 967.979 ns]

    286.818 us StlVecVec insert              [250 x   1.147 us]
    226.021 us StlVecVec lookup found        [250 x 904.083 ns]
    105.381 us StlVecVec lookup not found    [100 x   1.054 us]

    314.236 us FlattenSet insert             [250 x   1.257 us]
    299.931 us FlattenSet lookup found       [250 x   1.200 us]
    114.918 us FlattenSet lookup not found   [100 x   1.149 us]

    300.884 us FlattenVec insert             [250 x   1.204 us]
    318.050 us FlattenVec lookup found       [250 x   1.272 us]
    124.693 us FlattenVec lookup not found   [100 x   1.247 us]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

What's the overhead of TI_PROFILER -- what about an empty lookup/insert function with the same parameters?

That's a great point. Empty shows that profiling does incur some cost, though subtracting that cost still doesn't bring the numbers anywhere near 50 ns. That said, I found these numbers to have a pretty large variance. E.g., for LLVMVecSet, insert ranged from about 450 to 700 ns. So maybe it's better to run this on the MIT server?

$ ti test -c benchmark_sfg
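
The baseline subtraction described above can be sketched as follows. This is only an illustration with `std::chrono`, not TI_PROFILER; `ns_per_call` and `net_ns` are hypothetical names, and an optimizing compiler may remove an empty callable entirely, so a real baseline needs some care.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>

// Hypothetical helper (not TI_PROFILER): average wall-clock ns per call of f.
template <typename F>
double ns_per_call(F &&f, int iters) {
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iters; ++i) {
    f(i);
  }
  const std::chrono::duration<double, std::nano> elapsed =
      std::chrono::steady_clock::now() - start;
  return elapsed.count() / iters;
}

// Cost attributable to the operation once the empty baseline is removed.
double net_ns(double measured_ns, double baseline_ns) {
  return measured_ns - baseline_ns;
}

void example_usage() {
  // Numbers taken from the run above: LLVMVecSet insert vs. Empty insert.
  const double net = net_ns(576.973, 291.824);
  assert(std::fabs(net - 285.149) < 1e-6);
}
```

With the numbers from this run, LLVMVecSet insert comes out at roughly 285 ns after subtracting the Empty baseline, which is still well above 50 ns.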

What about llvm::SmallVector/SmallSet<std::pair<int, int>, 16>? I wonder if the indirection caused by nested data structures is the bottleneck.

They didn't really help :-(
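
For reference, a flattened layout of the kind suggested can be sketched like this. It uses `std::vector` instead of `llvm::SmallVector` so that it compiles without LLVM, and the names `flat_insert` and `flat_contains` are hypothetical; the FlattenVec used in the benchmark may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// One flat list of (key, value) pairs instead of a nested container.
using FlatVec = std::vector<std::pair<int, int>>;

// Hypothetical helper: appends (key, value) unless it is already stored.
bool flat_insert(FlatVec &v, int key, int value) {
  const std::pair<int, int> edge{key, value};
  if (std::find(v.begin(), v.end(), edge) != v.end()) {
    return false;  // duplicate found by linear scan
  }
  v.push_back(edge);
  return true;
}

// Hypothetical helper: true if (key, value) is stored.
bool flat_contains(const FlatVec &v, int key, int value) {
  return std::find(v.begin(), v.end(), std::make_pair(key, value)) != v.end();
}

void flat_example() {
  FlatVec v;
  assert(flat_insert(v, 1, 10));
  assert(!flat_insert(v, 1, 10));  // duplicate is rejected
  assert(flat_contains(v, 1, 10));
  assert(!flat_contains(v, 1, 11));
}
```

This removes the nested indirection, but every insert and lookup scans all stored pairs rather than only the values under one key.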

@xumingkuan
Contributor

Here's the result on my laptop:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[Profiler thread 19504]
     65.000 us Empty insert                  [250 x 260.000 ns]
     46.000 us Empty lookup found            [250 x 183.999 ns]
     15.000 us Empty lookup not found        [100 x 150.001 ns]
     89.000 us UnorderedMapSet insert        [250 x 356.000 ns]
     52.000 us UnorderedMapSet lookup found  [250 x 208.001 ns]
     33.000 us UnorderedMapSet lookup not found [100 x 330.000 ns]
     77.000 us LLVMVecSet insert             [250 x 308.000 ns]
     49.000 us LLVMVecSet lookup found       [250 x 196.001 ns]
     22.000 us LLVMVecSet lookup not found   [100 x 219.999 ns]
     64.000 us LLVMVecVec insert             [250 x 256.000 ns]
     67.000 us LLVMVecVec lookup found       [250 x 268.000 ns]
     20.000 us LLVMVecVec lookup not found   [100 x 199.998 ns]
    128.000 us StlVecSet insert              [250 x 511.999 ns]
     63.000 us StlVecSet lookup found        [250 x 252.001 ns]
     22.000 us StlVecSet lookup not found    [100 x 220.002 ns]
    111.000 us StlVecVec insert              [250 x 444.000 ns]
     60.000 us StlVecVec lookup found        [250 x 240.000 ns]
     22.000 us StlVecVec lookup not found    [100 x 219.998 ns]
     86.000 us FlattenSet insert             [250 x 344.001 ns]
     85.000 us FlattenSet lookup found       [250 x 339.999 ns]
     41.000 us FlattenSet lookup not found   [100 x 410.002 ns]
     70.000 us FlattenVec insert             [250 x 280.000 ns]
     84.000 us FlattenVec lookup found       [250 x 336.001 ns]
     27.000 us FlattenVec lookup not found   [100 x 270.000 ns]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
===============================================================================
All tests passed (2550 assertions in 1 test case)

@xumingkuan
Contributor

xumingkuan commented Oct 9, 2020

Result on kun:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[Profiler thread 140048205031232]
    857.353 us Empty insert                  [25000 x  34.294 ns]
      1.344 ms Empty lookup found            [25000 x  53.749 ns]
    529.528 us Empty lookup not found        [10000 x  52.953 ns]
      3.287 ms UnorderedMapSet insert        [25000 x 131.464 ns]
      2.302 ms UnorderedMapSet lookup found  [25000 x  92.087 ns]
    823.498 us UnorderedMapSet lookup not found [10000 x  82.350 ns]
      1.943 ms LLVMVecSet insert             [25000 x  77.734 ns]
      1.981 ms LLVMVecSet lookup found       [25000 x  79.260 ns]
    738.382 us LLVMVecSet lookup not found   [10000 x  73.838 ns]
      1.960 ms LLVMVecVec insert             [25000 x  78.392 ns]
      2.096 ms LLVMVecVec lookup found       [25000 x  83.847 ns]
    793.457 us LLVMVecVec lookup not found   [10000 x  79.346 ns]
      2.981 ms StlVecSet insert              [25000 x 119.247 ns]
      2.493 ms StlVecSet lookup found        [25000 x  99.707 ns]
    835.180 us StlVecSet lookup not found    [10000 x  83.518 ns]
      2.334 ms StlVecVec insert              [25000 x  93.346 ns]
      2.215 ms StlVecVec lookup found        [25000 x  88.615 ns]
    805.140 us StlVecVec lookup not found    [10000 x  80.514 ns]
      2.606 ms FlattenSet insert             [25000 x 104.256 ns]
      2.362 ms FlattenSet lookup found       [25000 x  94.461 ns]
    866.890 us FlattenSet lookup not found   [10000 x  86.689 ns]
      2.422 ms FlattenVec insert             [25000 x  96.893 ns]
      2.311 ms FlattenVec lookup found       [25000 x  92.459 ns]
    917.196 us FlattenVec lookup not found   [10000 x  91.720 ns]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

@codecov

codecov bot commented Oct 9, 2020

Codecov Report

Merging #1927 into master will decrease coverage by 0.03%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1927      +/-   ##
==========================================
- Coverage   43.76%   43.72%   -0.04%     
==========================================
  Files          45       45              
  Lines        6202     6207       +5     
  Branches     1101     1103       +2     
==========================================
  Hits         2714     2714              
- Misses       3319     3322       +3     
- Partials      169      171       +2     
Impacted Files Coverage Δ
python/taichi/core/util.py
python/taichi/misc/util.py
python/taichi/lang/matrix.py
python/taichi/core/__init__.py
python/taichi/tools/__init__.py
python/taichi/lang/core.py
python/taichi/testing.py
python/taichi/lang/__init__.py
python/taichi/misc/gui.py
python/taichi/tools/video.py
... and 80 more


@yuanming-hu yuanming-hu merged commit 48e5f48 into taichi-dev:master Oct 10, 2020
@yuanming-hu yuanming-hu mentioned this pull request Oct 10, 2020
@k-ye k-ye deleted the bm-edges branch October 18, 2020 06:28